Cluster problems

Monitor the status of your runtime cluster and runtime cloud nodes using the Cluster Status panel on the Runtime Management page (Manage > Runtime Management).

  1. Click the Cluster Issues tab on the Cluster Status panel to view cluster problem reports. If the tab is not present, there are currently no reported cluster problems.

The following table shows information about all of the different types of reportable cluster problems.

note

Cluster problem reports, including a node’s “problem” property values, are written to the node's node.localhostid.dat file (also known as the “view snapshot” file). The values of “problem” properties also appear in the container log file.
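Because the "problem" property values also appear in the container log, you can scan log lines for them programmatically. The sketch below is a minimal, hypothetical example: the log line format shown is an assumption, not the actual container log format, so adjust the pattern for your installation.

```python
import re

# Hypothetical log line format -- adjust for your installation's actual logs.
SAMPLE_LINE = "WARNING cluster view check: problem=NODE_DOWN node=node2"

# Match a "problem" property value (the documented values are upper-case
# identifiers such as NODE_DOWN or HEAD_AWOL).
PROBLEM_RE = re.compile(r"problem=([A-Z_]+)")

def extract_problems(lines):
    """Return the problem property values found in an iterable of log lines."""
    found = []
    for line in lines:
        match = PROBLEM_RE.search(line)
        if match:
            found.append(match.group(1))
    return found

print(extract_problems([SAMPLE_LINE]))  # -> ['NODE_DOWN']
```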

| Problem property value | Importance | Description | Resolution/Troubleshooting |
| --- | --- | --- | --- |
| CLUSTER_STATE_MISMATCH | Warning | Different nodes in the cluster have differing cluster state information. | Restart the problem node if the issue persists. |
| CONTAINER_VERSION_MISMATCH | Warning | There are differing container versions (build numbers) across the various view snapshots. | Restart all nodes if the issue persists. |
| DIFFERENT_NODES | Warning | Two views that otherwise seem to agree (view ID and head node) do not have all the same nodes. | Check for cluster communication problems if the issue persists. |
| HEAD_AWOL | Severe error | The head node, according to the live view, does not have a corresponding view snapshot. | Check for network issues and communication problems. |
| HEAD_SUSPECT | Warning | Like HEAD_AWOL, but some nodes are still starting up, so this could be a timing issue. | Wait for nodes to finish starting up. |
| HEAD_SUSPECT_ESCALATED | Severe error | The HEAD_SUSPECT warning is escalated to a severe warning. | If the issue persists, the node is forcefully deleted. |
| LOCALHOSTID_CONFLICT | Severe error | Multiple nodes are writing to the same view snapshot file, indicating a conflict in localHostId. | Ensure nodes have unique localHostIds. |
| MINIMUM_CLUSTER_SIZE | Severe error | A node is waiting to restart but the cluster has reached its minimum allowable size. | Wait for the number of active nodes in the cluster to increase. |
| MULTIPLE_HEAD_NODES | Severe error | There is more than one head node in the various view snapshot files. | Check for network and configuration issues. |
| NODE_AWOL | Severe error | One or more nodes in the live view do not have a corresponding view snapshot. | Similar to the HEAD_AWOL problem, except that the missing nodes are not the head node. |
| NODE_DOWN | Warning | A node is either not running and did not remove its view file, or is hanging and no longer updating its view file. | Remove the offending file or restart the node. |
| NODE_SUSPECT | Warning | Like NODE_AWOL, but some nodes are still starting up, so this could be a timing issue. | Wait for nodes to finish starting up. |
| ORPHANED_NODE | Severe error | The head node's view snapshot does not include this node. | Check for network and communication issues. |
| READ_FAILURE | Warning | Could not read a view snapshot file. | Check the file system and container logs. |
| ROLLING_RESTART_* | Warning | Includes ROLLING_RESTART_MULTIPLE_HEAD_NODES, ROLLING_RESTART_VIEW_ID_MISMATCH, ROLLING_RESTART_VIEW_FILE_MISMATCH, ROLLING_RESTART_JAVA_HOME_MISMATCH, ROLLING_RESTART_ORPHANED_NODE, ROLLING_RESTART_HEAD_AWOL, and ROLLING_RESTART_NODE_AWOL. These issues are generally considered severe but are downgraded to warnings during a rolling restart. | These issues are often transient and can generally be ignored while a rolling restart is in progress. If they persist after the restart completes, investigate them as you would if they appeared outside of a rolling restart. |
| UNEXPECTED_HEAD_NODE_CHANGE | Severe error | One of the nodes changed its head node to a different one without receiving a head node change notification. | Check for network latency or partition issues. |
| VIEW_ID_CONFLICT | Severe error | Two different nodes have the same view ID. | Check network settings and view files. |
| VIEW_ID_MISMATCH | Warning | The head node has a view ID that does not match other nodes' view IDs. | Check for network issues and consider restarting nodes. |
| WRITE_FAILURE | Severe error | Could not write to a view snapshot file. | Check the file system and container logs. |
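If you are feeding these reports into your own monitoring, the table above can be encoded as a simple severity lookup. The following sketch is an illustrative example only (the `triage` helper is hypothetical, not part of any product API); it also mirrors the rolling-restart downgrade described in the table.

```python
# Severity of each reportable cluster problem, as listed in the table above.
SEVERITY = {
    "CLUSTER_STATE_MISMATCH": "warning",
    "CONTAINER_VERSION_MISMATCH": "warning",
    "DIFFERENT_NODES": "warning",
    "HEAD_AWOL": "severe",
    "HEAD_SUSPECT": "warning",
    "HEAD_SUSPECT_ESCALATED": "severe",
    "LOCALHOSTID_CONFLICT": "severe",
    "MINIMUM_CLUSTER_SIZE": "severe",
    "MULTIPLE_HEAD_NODES": "severe",
    "NODE_AWOL": "severe",
    "NODE_DOWN": "warning",
    "NODE_SUSPECT": "warning",
    "ORPHANED_NODE": "severe",
    "READ_FAILURE": "warning",
    "UNEXPECTED_HEAD_NODE_CHANGE": "severe",
    "VIEW_ID_CONFLICT": "severe",
    "VIEW_ID_MISMATCH": "warning",
    "WRITE_FAILURE": "severe",
}

def triage(problems, rolling_restart=False):
    """Split reported problem values into (severe, warnings).

    ROLLING_RESTART_* variants are generally severe, but are downgraded
    to warnings while a rolling restart is in progress, mirroring the
    behavior described in the table above.
    """
    severe, warnings = [], []
    for problem in problems:
        if problem.startswith("ROLLING_RESTART_"):
            level = "warning" if rolling_restart else "severe"
        else:
            level = SEVERITY.get(problem, "unknown")
        (severe if level == "severe" else warnings).append(problem)
    return severe, warnings

print(triage(["NODE_DOWN", "HEAD_AWOL"]))  # -> (['HEAD_AWOL'], ['NODE_DOWN'])
```

For example, `triage(["ROLLING_RESTART_HEAD_AWOL"], rolling_restart=True)` treats the report as a transient warning, while the same report outside a rolling restart is flagged as severe.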

For more information on cluster monitoring, refer to the following links: